29 research outputs found

RIME: Repeat Identification

Author: Federico M
Peterlongo P
PISANTI NADIA
Sagot MF
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

We present an algorithm for detecting long similar fragments occurring at least twice in a set of biological sequences. The problem becomes computationally challenging when the frequency of a repeat is allowed to increase and when a non-negligible number of insertions, deletions and substitutions are allowed. We introduce in this paper an algorithm, Rime (for Repeat Identification: long, Multiple, and with Edits), that performs this task, and manages instances whose size and combination of parameters cannot be handled by other currently existing methods. This is achieved by using a filter as a preprocessing step, and by then exploiting the information gathered by the filter in the following actual repeat inference step. To the best of our knowledge, Rime is the first algorithm that can accurately deal with very long repeats (up to a few thousands), occurring possibly several times, and with a rate of differences (substitutions and indels) allowed among copies of the same repeat of 10-15% or even more. (Footnote: Rime is also a reference to Coleridge's poem "The Rime of an Ancient Mariner", which contains many repetitions as a poetic device.)
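To make the "rate of differences" in the abstract above concrete, here is a minimal sketch that measures the edit distance (substitutions, insertions, deletions) between two copies of a repeat and normalizes it by the longer copy. This is not the Rime algorithm; the function names and the choice to normalize by the longer length are assumptions made for illustration only.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance with unit costs."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def difference_rate(a: str, b: str) -> float:
    """Edit distance divided by the longer length (an assumed normalization)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0
```

Under this assumed normalization, two 1000-character copies that differ by 120 edits would have a rate of 0.12, inside the 10-15% range the abstract mentions.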

Archivio della Ricerca - Università di Pisa

Efficient sequential and parallel algorithms for finding edit distance based motifs

Author: DE Knuth
E Rocke
EF Adebiyi
JK Lanctot
M Nicolae
MF Sagot
Peng Xiao
Q Yu
S Karlin
S Pathak
S Rajasekaran
S Tanaka
Sanguthevar Rajasekaran
Soumitra Pal
X Wang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

EXMOTIF: efficient structured motif extraction

Author: A Apostolico
A Apostolico
A Brazma
A Carvalho
A Carvalho
A Policriti
AM Carvalho
D Thakurta
E Eskin
E Eskin
G Benson
G Pavesi
G Pavesi
J van Helden
J Zhu
L Marsan
M Friberg
M Zhang
MF Sagot
MJ Zaki
Mohammed J Zaki
N Pisanti
P Michailidis
S Sinha
S Sinha
TL Bailey
Yongqiang Zhang
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Extracting motifs from sequences is a mainstay of bioinformatics. We look at the problem of mining structured motifs, which allow variable length gaps between simple motif components. We propose an efficient algorithm, called EXMOTIF, that given some sequence(s), and a structured motif template, extracts all frequent structured motifs that have quorum q. Potential applications of our method include the extraction of single/composite regulatory binding sites in DNA sequences. RESULTS: EXMOTIF is efficient in terms of both time and space and is shown empirically to outperform RISO, a state-of-the-art algorithm. It is also successful in finding potential single/composite transcription factor binding sites. CONCLUSION: EXMOTIF is a useful and efficient tool in discovering structured motifs, especially in DNA sequences. The algorithm is available as open-source at:

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central
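The structured-motif notion in the EXMOTIF abstract above (simple components separated by variable-length gaps, kept only if they reach a quorum q) can be illustrated with a naive brute-force sketch. This is not the EXMOTIF algorithm. The function name, the template encoding as component lengths plus gap ranges, and the reading of quorum as "number of sequences containing at least one occurrence" are all assumptions for illustration.

```python
from collections import defaultdict


def frequent_structured_motifs(seqs, lengths, gaps, quorum):
    """Enumerate structured motifs matching a template and keep the frequent ones.

    lengths: length of each simple component, e.g. [3, 3]
    gaps:    (lo, hi) gap range between consecutive components, e.g. [(1, 3)]
    quorum:  minimum number of sequences that must contain the motif
    """
    support = defaultdict(set)  # motif (tuple of components) -> sequence ids

    def extend(seq, seq_id, pos, comp, parts):
        size = lengths[comp]
        if pos + size > len(seq):
            return
        parts = parts + (seq[pos:pos + size],)
        if comp == len(lengths) - 1:
            support[parts].add(seq_id)
            return
        lo, hi = gaps[comp]
        for gap in range(lo, hi + 1):  # try every allowed gap length
            extend(seq, seq_id, pos + size + gap, comp + 1, parts)

    for seq_id, seq in enumerate(seqs):
        for start in range(len(seq)):
            extend(seq, seq_id, start, 0, ())

    return {m: len(ids) for m, ids in support.items() if len(ids) >= quorum}
```

For example, with the sequences `["ACGTTTGCA", "ACGAAGCA", "TTTTTTTT"]`, component lengths `[3, 3]`, a gap range of `(1, 3)` and quorum 2, the only motif kept is `("ACG", "GCA")`, which occurs with a gap of 3 in the first sequence and a gap of 2 in the second.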

RecMotif: a novel fast algorithm for weak motif discovery

Author: A Price
CE Lawrence
E Fratkin
E Wijaya
FP Roth
G Pavesi
G Wang
GD Stormo
GZ Hertz
GZ Hertz
He Quan Sun
HJ Bussemaker
J Buhler
J Davila
J Davila
Jagath C Rajapakse
L Ming
Malcolm Yoke Hean Low
MF Sagot
PA Pevzner
S Liang
S Rajasekaran
S Sinha
SH Sze
TL Bailey
U Keich
Wen Jing Hsu
X Yang
Z Yao
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

PubMed Central

A polynomial time biclustering algorithm for finding approximate expression patterns in gene expression time series

Author: A Ben-Dor
A Prelic
A Tanay
AP Gasch
Arlindo L Oliveira
C Wu
D Gusfield
D Martin
E Yang
GJ McLachlan
IP Androulakis
IV Mechelen
J Liu
J Liu
J Liu
J Liu
L Ji
L Ji
M Koyuturk
MC Teixeira
MF Sagot
Q Sheng
R Peeters
S Lonardi
Sara C Madeira
SC Madeira
SC Madeira
SC Madeira
SC Madeira
SC Madeira
TM Murali
Y Cheng
Y Zhang
Z Bar-Joseph
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The ability to monitor the change in expression patterns over time, and to observe the emergence of coherent temporal responses using gene expression time series, obtained from microarray experiments, is critical to advance our understanding of complex biological processes. In this context, biclustering algorithms have been recognized as an important tool for the discovery of local expression patterns, which are crucial to unravel potential regulatory mechanisms. Although most formulations of the biclustering problem are NP-hard, when working with time series expression data the interesting biclusters can be restricted to those with contiguous columns. This restriction leads to a tractable problem and enables the design of efficient biclustering algorithms able to identify all maximal contiguous column coherent biclusters. Methods: In this work, we propose e-CCC-Biclustering, a biclustering algorithm that finds and reports all maximal contiguous column coherent biclusters with approximate expression patterns in time polynomial in the size of the time series gene expression matrix. This polynomial time complexity is achieved by manipulating a discretized version of the original matrix using efficient string processing techniques. We also propose extensions to deal with missing values, discover anticorrelated and scaled expression patterns, and different ways to compute the errors allowed in the expression patterns. We propose a scoring criterion combining the statistical significance of expression patterns with a similarity measure between overlapping biclusters. Results: We present results in real data showing the effectiveness of e-CCC-Biclustering and its relevance in the discovery of regulatory modules describing the transcriptomic expression patterns occurring in Saccharomyces cerevisiae in response to heat stress. In particular, the results show the advantage of considering approximate patterns when compared to state of the art methods that require exact matching of gene expression time series. Discussion: The identification of co-regulated genes, involved in specific biological processes, remains one of the main avenues open to researchers studying gene regulatory networks. The ability of the proposed methodology to efficiently identify sets of genes with similar expression patterns is shown to be instrumental in the discovery of relevant biological phenomena, leading to more convincing evidence of specific regulatory mechanisms. Availability: A prototype implementation of the algorithm coded in Java together with the dataset and examples used in the paper is available in http://kdbio.inesc-id.pt/software/e-ccc-biclustering.
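The idea of contiguous column coherent biclusters on a discretized matrix can be illustrated with a brute-force sketch. It is not the e-CCC-Biclustering algorithm: it handles exact patterns only (no allowed errors), does not filter for maximality, and scans every column window instead of using efficient string processing. The function name and the symbols U/D/N (up, down, no change) are assumptions for illustration.

```python
from collections import defaultdict


def contiguous_column_biclusters(rows, min_genes=2, min_cols=2):
    """Group genes sharing the same discretized pattern over contiguous columns.

    rows: one string per gene, one symbol per time point (e.g. "UDNU").
    Returns a list of (gene_indices, start_col, end_col_exclusive, pattern).
    """
    n_cols = len(rows[0])
    found = []
    for start in range(n_cols):
        for end in range(start + min_cols, n_cols + 1):
            groups = defaultdict(list)  # pattern in this window -> gene ids
            for gene, symbols in enumerate(rows):
                groups[symbols[start:end]].append(gene)
            for pattern, genes in groups.items():
                if len(genes) >= min_genes:
                    found.append((genes, start, end, pattern))
    return found
```

With the rows `["UDNU", "UDNN", "NDNU"]`, all three genes share the pattern `"DN"` over columns 1 to 2, so `([0, 1, 2], 1, 3, "DN")` is among the reported biclusters. Restricting biclusters to contiguous columns is what keeps the search space to the O(columns²) windows scanned here.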

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

GRISOTTO: A greedy approach to improve combinatorial algorithms for motif discovery with prior knowledge

Author: A Valouev
Alexandra M Carvalho
AM Carvalho
AP Fejes
Arlindo L Oliveira
C Deremble
C Lee
CT Harbison
D Ucar
E Segal
E Valen
F Daenen
G Paillard
G Paillard
G Pavesi
GC Yuan
I Lafontaine
I Lafontaine
I Lafontaine
IV Kulakovskiy
JV Ponomarenko
KD MacIsaac
L Marsan
L Narlikar
L Narlikar
M Hu
M Kellis
MF Sagot
N Pisanti
R Gordân
R Gordân
R Gordân
R Pudimat
R Siddharthan
RA O'Flanagan
RG Beiko
S Sinha
T Wang
TL Bailey
TL Bailey
V Matys
WW Wasserman
X Chen
Y Liu
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Position-specific priors (PSP) have been used with success to boost EM and Gibbs sampler-based motif discovery algorithms. PSP information has been computed from different sources, including orthologous conservation, DNA duplex stability, and nucleosome positioning. Prior information has not yet been used in the context of combinatorial algorithms. Moreover, priors have been used only independently, and the gain of combining priors from different sources has not yet been studied. Results: We extend RISOTTO, a combinatorial algorithm for motif discovery, by post-processing its output with a greedy procedure that uses prior information. PSPs from different sources are combined into a scoring criterion that guides the greedy search procedure. The resulting method, called GRISOTTO, was evaluated over 156 yeast TF ChIP-chip sequence-sets commonly used to benchmark prior-based motif discovery algorithms. Results show that GRISOTTO is at least as accurate as twelve other state-of-the-art approaches for the same task, even without combining priors. Furthermore, by considering combined priors, GRISOTTO is considerably more accurate than the state-of-the-art approaches for the same task. We also show that PSPs improve GRISOTTO's ability to retrieve motifs from mouse ChIP-seq data, indicating that the proposed algorithm can be applied to data from a different technology and for a higher eukaryote. Conclusions: The conclusions of this work are twofold. First, post-processing the output of combinatorial algorithms by incorporating prior information leads to a very efficient and effective motif discovery method. Second, combining priors from different sources is even more beneficial than considering them separately.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Crystal Structure of the Formin mDia1 in Autoinhibited Conformation

Author: A Seth
A Shimada
AG Nezami
AM Weaver
AS Alberts
AS Kim
AS Paul
BA Johnson
BJ Wallar
BJ Wallar
BL Goode
C Higashida
Chinatsu Otomo
D Pruyne
D Vavylonis
Diana R. Tomchick
DR Kovar
DR Kovar
DR Kovar
F Chang
F Delaglio
F Li
F Li
HN Higgs
HN Higgs
I Sagot
IW Davis
J Lu
J Navaza
JA Spudich
JB Marchand
K Cowtan
KG Campellone
KH Gardner
LR Otterbein
M Evangelista
M Evangelista
M Lammers
M Lei
MA Chesarone
MA Pufall
MF Carlier
Michael K. Rosen
Mischa Machius
N Watanabe
PD Adams
Petri Kursula
R Habas
R Rose
S Narumiya
S Romero
S Zigmond
T Matusek
T Otomo
T Otomo
T Shemesh
TA Jones
Takanori Otomo
TD Pollard
TD Pollard
Y Xu
Z Otwinowski
Z Otwinowski
Publication venue: Public Library of Science
Publication date: 01/09/2010

Formin proteins utilize a conserved formin homology 2 (FH2) domain to nucleate new actin filaments. In mammalian diaphanous-related formins (DRFs) the FH2 domain is inhibited through an unknown mechanism by intramolecular binding of the diaphanous autoinhibitory domain (DAD) and the diaphanous inhibitory domain (DID). Here we report the crystal structure of a complex between DID and FH2-DAD fragments of the mammalian DRF, mDia1 (mammalian diaphanous 1, also called Drf1 or p140mDia). The structure shows a tetrameric configuration (4 FH2 + 4 DID) in which the actin-binding sites on the FH2 domain are sterically occluded. However, biochemical data suggest the full-length mDia1 is a dimer in solution (2 FH2 + 2 DID). Based on the crystal structure, we have generated possible dimer models and found that architectures of all of these models are incompatible with binding to actin filament but not to actin monomer. Furthermore, we show that the minimal functional monomeric unit in the FH2 domain, termed the bridge element, can be inhibited by isolated monomeric DID. NMR data on the bridge-DID system revealed that at least one of the two actin-binding sites on the bridge element is accessible to actin monomer in the inhibited state. Our findings suggest that autoinhibition in the native DRF dimer involves steric hindrance with the actin filament. Although the structure of a full-length DRF would be required for clarification of the presented models, our work here provides the first structural insights into the mechanism of the DRF autoinhibition.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Septation of Infectious Hyphae Is Critical for Appressoria Formation and Virulence in the Smut Fungus Ustilago Maydis

Author: A Bottin
A Brachmann
A Brachmann
A Hlubek
A Mendoza-Mendoza
A Schirenbeck
A Yonetani
B Sandrock
B Schulz
Björn Sandrock
Brett Tyler
C Bechinger
C Böhmer
C Böhmer
C Veneault-Fourrey
CA Kumamoto
CF Robinow
Christian Böhmer
CT Ingold
D Chalkia
D Lanver
D Pruyne
D Pruyne
Daniel Lanver
DG Saunders
DH Castrillon
DR Kovar
F Banuett
F Banuett
F Chang
F Chang
F Li
G Döhlemann
G Loubradou
G Steinberg
G Weinzierl
H Imamura
HN Higgs
HP Schmitz
I Alvarez-Tabares
I Sagot
J Kämper
J Kämper
J Schirawski
JC de Jong
JJ Christensen
Johannes Freitag
K Heimel
Kay Oliver Schink
KM Snetselaar
KP Dixon
M Bathe
M Böhmer
M Bölker
M Evangelista
M Evangelista
M Evangelista
M Kikyo
M Mahlert
M Scherer
MF Olson
Michael Bölker
MJ Kershaw
P Wadsworth
PR Day
R Kahmann
RA Wilson
RW Emmett
S Oliferenko
SL Tucker
T Garcia-Muse
T Kamei
T Spellig
W Choi
W Liu
Publication venue: Public Library of Science
Publication date: 01/01/2011

Differentiation of hyphae into specialized infection structures, known as appressoria, is a common feature of plant pathogenic fungi that penetrate the plant cuticle. Appressorium formation in U. maydis is triggered by environmental signals but the molecular mechanism of this hyphal differentiation is largely unknown. Infectious hyphae grow on the leaf surface by inserting regularly spaced retraction septa at the distal end of the tip cell, leaving empty sections of collapsed hyphae behind. Here we show that formation of retraction septa is critical for appressorium formation and virulence in U. maydis. We demonstrate that the diaphanous-related formin Drf1 is necessary for actomyosin ring formation during septation of infectious hyphae. Drf1 acts as an effector of a Cdc42 GTPase signaling module, which also consists of the Cdc42-specific guanine nucleotide exchange factor Don1 and the Ste20-like kinase Don3. Deletion of drf1, don1 or don3 abolished formation of retraction septa, resulting in reduced virulence. Appressorium formation in these mutants was not completely blocked but infection structures were found only at the tip of short filaments, indicating that retraction septa are necessary for appressorium formation in extended infectious hyphae. In addition, appressoria of drf1 mutants penetrated the plant tissue less frequently.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Genome Sequence of the Pea Aphid Acyrthosiphon pisum

Author: Abdel-latief M
Aguilar J
Alioto T
Altincicek B
Anselme C
Ashton P
Ashton PD
Atamian H
Barribeau SM
Bell SN
Bickel RD
Brault V
Brisson JA
Burke G
Butts T
Caillaud M
Calevro F
Calevro F
Camara F
Campbell P
Carolan JC
Cass B
Cazzamali G
Chacko J
Chandrabose MN
Chang C-C
Charles H
Chavez D
Chen H-C
Christiaens O
Colella S
Collin O
Consortium IAG
Cortes T
Cottret L
Cree A
Dale RP
Dang PM
Dao MD
Davies TGE
Davis C
Davis GK
de Vos M
Dearden P
Degnan P
Diaz J
Dinh HH
Dombrovsky A
Douglas A
Douglas AE
Duncan E
Duncan EJ
Edwards OR
Ermolaeva O
Evans J
Febvay G
Febvay G
Fenton B
Ferrier D
Field LM
Fitzroy CIJ
Fowler GR
Fukatsu T
Futami R
Gabaldon T
Gabisi RA
Gatehouse JA
Gauthier J-P
Gerardo NM
Ghanim M
Gibbs RA
Gilbert D
Gordon K
Grimmelikhuijzen CJP
Guigo R
Hansen KK
Hauser F
He X-L
Heckel DG
Heddi A
Hedges D
Hilgarth RS
Hines S
Hitchens ME
Hlavina W
Huang W
Huerta-Cepas J
Hume J
Hunnicutt L
Hunter W
Hurwitz B
Huybrechts J
Iga M
Ishikawa A
Jander G
Janssen R
Jaubert-Possamai S
Jhangian SN
Jiang H
Johnson AJ
Jones A
Jones DH
Jones DH
Joshi V
Jr RSD
Jr WRL
Ju H-J
Kaloshian I
Kamins A
Kamphuis LG
Kapustin Y
Kiryutin B
Kitts P
Koga R
Kosarev P
Kovar C
Kudo T
Kudo T
Kushlan PF
Latorre A
Lavenier D
Lee SL
Legeai F
Leonardo T
Lewis LR
Liu R
Liu Y-S
Llorens C
Lopez J
Lozado RJ
Lu H-L
Macdonald S
Maglott D
Marcet-Houben M
Mariotti M
Martinez-Torres D
McCutcheon JP
McGregor A
Miura T
Miyagishima S-Y
Moen C
Monsion B
Moran N
Moran NA
Morgan MB
Morioka M
Moya A
Murphy T
Muzny D
Nakabachi A
Nazareth LV
Nguyen NB
Nicolas J
Nikoh N
Okwuonu GO
Ollivier M
Ortiz-Rivas B
Parker BJ
Patel BM
Pechuan X
Perez-Brocal V
Permal E
Pignatelli M
Price DRG
Pruitt K
Pu L-L
Quesneville H
Rahbe Y
Ramsey J
Reardon KT
Reeck GR
Ren Q
Richards S
Rispe C
Rizk G
Robertson HM
Robertson HM
Robertson HM
Robichon A
Rozas J
Ruiz SJ
Sagot MF
Santibanez J
Sapojnikov V
Sattelle D
Schneider M
Schwartz J
Seah S
Shigenobu S
Singh K
Smadja C
Smagghe G
Smith J
Solovyev V
Souvorov A
Spragg CJ
Srinivasan D
Stafflinger E
Stanke M
Steffen D
Stern D
Tagu D
Tamames J
Tamarit D
Tamborindeguy C
Thibaud-Nissen F
Thomas G
van der Zee M
van Fleet E
Vattathil S
Veenstra JA
Velarde R
Vellozo A
Vellozo A
Vieira FG
Vilcinskas A
Vincent-Monegat C
Walsh TK
Warren JT
Wilkinson TL
Williamson M
Williamson MS
Williamson S
Wilson ACC
Wilson M
Wolstenholme A
Worley KC
Wright RA
Zhang J
Zhang L
Zhou J-J
Publication venue: Public Library of Science
Publication date: 01/01/2010

Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the aphid and Buchnera. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems.

Fraunhofer-ePrints

Directory of Open Access Journals

Digitala Vetenskapliga Arkivet - Academic Archive On-line

UPF Digital Repository

White Rose Research Online

Rothamsted Repository

Publikationer från Uppsala Universitet

Ghent University Academic Bibliography

Royal Holloway - Pure

Secretaría de Estado de Cultura

Diposit Digital de la Universitat de Barcelona

University of St. Andrews - Pure